Computing Similarity between RNA Strings

Authors

Vineet Bafna

S. Muthukrishnan

R. Ravi

Abstract

Ribonucleic acid (RNA) strings are strings over the four-letter alphabet {A, C, G, U} with a secondary structure of base-pairing between A-U and C-G pairs in the string. Edges are drawn between two bases that are paired in the secondary structure, and these edges have traditionally been assumed to be noncrossing. The noncrossing base-pairing naturally leads to a tree-like representation of the secondary structure of RNA strings. In this paper, we address several notions of similarity between two RNA strings that take into account both the primary sequence and the secondary base-pairing structure of the strings. We present efficient algorithms for exact matching and approximate matching between two RNA strings. We define a notion of alignment between two RNA strings and devise algorithms based on dynamic programming. We then present a method for optimally aligning a given RNA string with unknown secondary structure to one with known sequence and structure, thus attacking the structure prediction problem in the case when the structure of a closely related sequence is known. The techniques employed to prove our results include reductions to well-known string matching problems allowing wild cards and ranges, and speeding up dynamic programming by using the tree structures implicit in the secondary structure of RNA strings.
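The link between noncrossing base pairs and a tree-like representation can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names `is_valid_structure` and `build_forest`, the 0-based `(i, j)` pair encoding, and the dictionary node layout are all assumptions made for illustration. The first function checks that a set of pairs is complementary (A-U or C-G), disjoint, and noncrossing; the second nests a valid set of pairs into a forest.

```python
from typing import Dict, List, Tuple

# Complementary bases as stated in the abstract (A-U and C-G only).
COMPLEMENT = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def is_valid_structure(seq: str, pairs: List[Tuple[int, int]]) -> bool:
    """Return True if pairs (0-based, i < j) are complementary,
    use each position at most once, and do not cross."""
    partner: Dict[int, int] = {}
    for i, j in pairs:
        if not (0 <= i < j < len(seq)):
            return False
        if (seq[i], seq[j]) not in COMPLEMENT:
            return False
        if i in partner or j in partner:
            return False
        partner[i] = j
        partner[j] = i
    # Left-to-right scan with a stack: the pairs are noncrossing exactly
    # when every closing position matches the most recently opened one.
    stack: List[int] = []
    for k in range(len(seq)):
        if k not in partner:
            continue
        if partner[k] > k:
            stack.append(k)
        elif not stack or stack.pop() != partner[k]:
            return False
    return not stack


def build_forest(pairs: List[Tuple[int, int]]) -> List[dict]:
    """Nest a valid noncrossing pair set into a forest.
    Each node is {"pair": (i, j), "children": [...]} (hypothetical layout)."""
    roots: List[dict] = []
    stack: List[dict] = []
    for i, j in sorted(pairs):
        # Drop pairs that close before this one opens; they cannot enclose it.
        while stack and stack[-1]["pair"][1] < i:
            stack.pop()
        node = {"pair": (i, j), "children": []}
        (stack[-1]["children"] if stack else roots).append(node)
        stack.append(node)
    return roots
```

For example, on the string `GCAUGC` the pairs `(0, 5)`, `(1, 4)`, `(2, 3)` are fully nested and produce a single chain of three nodes, whereas on `GACU` the pairs `(0, 2)` and `(1, 3)` cross and are rejected.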

Similar resources

Computing Similarity between RNA Strings

Ribonucleic acid (RNA) strings are strings over the four-letter alphabet {A, C, G, U} with a secondary structure of base-pairing between A-U and C-G pairs in the string. Edges are drawn between two bases that are paired in the secondary structure and these edges have traditionally been assumed to be noncrossing. The noncrossing base-pairing naturally leads to a tree-like representation of t...

Local Alignment of RNA Sequences with Arbitrary Scoring Schemes

Local similarity is an important tool in comparative analysis of biological sequences, and is therefore well studied. In particular, the Smith-Waterman technique and its normalized version are two established metrics for measuring local similarity in strings. In RNA sequences however, where one must consider not only sequential but also structural features of the inspected molecules, the concep...

13 Comparative RNA analysis

• R. Durbin, S. Eddy, A. Krogh und G. Mitchison, Biological sequence analysis, Cambridge, 1998 • D.W. Mount. Bioinformatics: Sequences and Genome analysis, 2001. • V. Bafna, S. Muthukrishnan, R. Ravi, Computing similarity between RNA strings. • D. Sankoff, Simultaneous solution of the RNA Folding , Alignment and Protosequence Problems, SIAM Journal of Appl. Math., 45,5,1985 • J. Gorodkin, L.J. ...

"Recent Methods for RNA Modeling Using Stochastic Context-free Grammars," Proc. Combinatorial Pattern

Ribonucleic acid (RNA) strings are strings over the four-letter alphabet {A, C, G, U} with a secondary structure of base-pairing between A-U and C-G pairs in the string. Edges are drawn between two bases that are paired in the secondary structure and these edges have traditionally been assumed to be noncrossing. The noncrossing base-pairing naturally leads to a tree-like representation of the s...

Harry: A Tool for Measuring String Similarity

Comparing strings and assessing their similarity is a basic operation in many application domains of machine learning, such as in information retrieval, natural language processing and bioinformatics. The practitioner can choose from a large variety of available similarity measures for this task, each emphasizing different aspects of the string data. In this article, we present Harry, a small t...

Publication date: 1995
